function observation
Towards Scalable Bayesian Optimization via Gradient-Informed Bayesian Neural Networks
Makrygiorgos, Georgios; Ip, Joshua Hang Sai; Mesbah, Ali
Bayesian optimization (BO) is a widely used method for data-driven optimization that generally relies on zeroth-order data of the objective function to construct probabilistic surrogate models. These surrogates guide the exploration-exploitation process toward finding the global optimum. While Gaussian processes (GPs) are commonly employed as surrogates of the unknown objective function, recent studies have highlighted the potential of Bayesian neural networks (BNNs) as scalable and flexible alternatives. Moreover, incorporating gradient observations into GPs, when available, has been shown to improve BO performance. However, the use of gradients within BNN surrogates remains unexplored. By leveraging automatic differentiation, gradient information can be seamlessly integrated into BNN training, resulting in more informative surrogates for BO. We propose a gradient-informed loss function for BNN training, effectively augmenting function observations with local gradient information. The effectiveness of this approach is demonstrated on well-known benchmarks in terms of improved BNN predictions and faster BO convergence as the number of decision variables increases.
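To make the idea of augmenting function observations with gradient observations concrete, here is a minimal sketch of a loss that penalizes mismatch in both predicted values and predicted input gradients. It is not the authors' loss, whose exact form the abstract does not give. Several things are assumptions made for illustration: a small deterministic tanh network stands in for the BNN, the gradient with respect to the inputs is written out analytically rather than obtained by automatic differentiation, the penalty is a plain squared error, and the names `predict`, `gradient_informed_loss`, and `grad_weight` are hypothetical.

```python
import math


def predict(params, x):
    """One-hidden-layer tanh network (a deterministic stand-in, not a BNN).

    Returns the predicted value and its gradient with respect to the input x.
    """
    W, b, v = params  # hidden weight rows, hidden biases, output weights
    value = 0.0
    grad = [0.0] * len(x)
    for w_row, b_k, v_k in zip(W, b, v):
        h = math.tanh(sum(w * xi for w, xi in zip(w_row, x)) + b_k)
        value += v_k * h
        # d/dx_i of v_k * tanh(w . x + b) is v_k * (1 - h^2) * w_i
        for i, w in enumerate(w_row):
            grad[i] += v_k * (1.0 - h * h) * w
    return value, grad


def gradient_informed_loss(params, X, y, G, grad_weight=1.0):
    """Mean squared error on values plus a weighted mean squared error on gradients.

    X: list of input points, y: observed function values,
    G: observed gradients (one list per point).
    grad_weight is an assumed trade-off parameter; 0 recovers a value-only loss.
    """
    value_term = 0.0
    grad_term = 0.0
    for x, y_obs, g_obs in zip(X, y, G):
        y_hat, g_hat = predict(params, x)
        value_term += (y_hat - y_obs) ** 2
        grad_term += sum((gh - go) ** 2 for gh, go in zip(g_hat, g_obs))
    n = len(X)
    return value_term / n + grad_weight * grad_term / n


# Toy data from f(x) = x0^2 + x1^2, whose gradient is (2*x0, 2*x1).
X = [[0.0, 0.0], [0.5, -0.5], [1.0, 0.25]]
y = [x0 * x0 + x1 * x1 for x0, x1 in X]
G = [[2.0 * x0, 2.0 * x1] for x0, x1 in X]

# Arbitrary (untrained) parameters, used only to evaluate the loss.
params = ([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.2], [1.0, -2.0])
loss_values_only = gradient_informed_loss(params, X, y, G, grad_weight=0.0)
loss_with_gradients = gradient_informed_loss(params, X, y, G, grad_weight=1.0)
```

Each observation point contributes one value residual and one residual per input dimension, so the gradient term adds more training signal per evaluation as the number of decision variables grows. In an autodiff framework the analytic derivative in `predict` would be replaced by differentiating the network output with respect to its input.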
Scalable Bayesian Optimization with Sparse Gaussian Process Models
Bayesian optimization forms a set of powerful tools that allows efficient black-box optimization and has been applied in a large variety of fields. In this thesis we first seek to advance Bayesian optimization by using estimated derivative observations. We then tackle the issues that arise in Bayesian optimization when a large number of derivative observations and/or function observations are present. We describe our motivations in Chapter 1. We then give a broad review of Bayesian optimization in Chapter 2, starting with the history of Bayesian optimization and its components.